Add the country translation engines (ip-intelligence.translation) #136

Open
Jamesr51d wants to merge 1 commit into main from feature/country-translation

Background

On-premise IP Intelligence returns CountryCodesGeographical and
CountryCodesPopulation as weighted lists of ISO country codes. The demos need
localized, complete, ordered country name and code lists so they can render a
country dropdown with the most probable country at the top. Cloud already
returns these properties in final form, so this work is on the on-premise path
only and the cloud engine is untouched. This is the Java port of the .NET
FiftyOne.IpIntelligence.Translation package, building on the new
pipeline.translation module.

Objectives

  • Engine 1: translate the weighted ISO codes from IP Intelligence to English
    country names.
  • Engine 2: translate those English names to the browser language and build
    complete, index-aligned, ordered "All" lists (most probable countries first
    by weight, then every remaining country alphabetically by translated name).
  • Localize by Accept-Language (or a query.translation override), defaulting
    to English when the language is English or unknown.
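
The language selection described in the last objective can be sketched with the standard library alone. This is a minimal illustration, not the module's code: the class name LanguageResolver, the supported-locale list, and the rule that an override replaces the header entirely are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: picks the translation language for a request. */
final class LanguageResolver {
    // Assumed set of languages that have translation files; illustrative only.
    private static final List<Locale> SUPPORTED =
        Arrays.asList(Locale.ENGLISH, Locale.FRENCH, Locale.GERMAN);

    private static final Locale FALLBACK = Locale.ENGLISH;

    /**
     * @param queryOverride  value of a query.translation style override, may be null
     * @param acceptLanguage raw Accept-Language header, may be null
     */
    static Locale resolve(String queryOverride, String acceptLanguage) {
        // Assumption: a non-empty override is used instead of the header.
        String raw = (queryOverride != null && !queryOverride.trim().isEmpty())
            ? queryOverride
            : acceptLanguage;
        if (raw == null || raw.trim().isEmpty()) {
            return FALLBACK;
        }
        try {
            // Accept "fr_FR" as well as "fr-FR"; parse() orders ranges by q-value.
            List<Locale.LanguageRange> ranges =
                Locale.LanguageRange.parse(raw.replace('_', '-'));
            // RFC 4647 lookup: "fr-FR" is truncated to "fr" until a match is found.
            Locale match = Locale.lookup(ranges, SUPPORTED);
            return match != null ? match : FALLBACK;
        } catch (IllegalArgumentException e) {
            // Malformed header: behave as if the language were unknown.
            return FALLBACK;
        }
    }
}
```

Listing language-only locales in the supported set keeps the lookup simple, because a regional range such as fr-FR falls back to fr during matching.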

Key Decisions

  • New module ip-intelligence.translation, depending on the new
    pipeline.translation, plus ip-intelligence.shared for the
    IPIntelligenceData interface.
  • IP Intelligence element data key: as of pipeline change #129 ("Fix cloud IP
    Intelligence engine: correct product key and support weighted/IP types"),
    both the on-premise and cloud engines use "ip" as their element data key,
    so that is the default source key for Engine 1. Because the key has changed
    between engine versions, Engine 1's source key is configurable
    (CountryCodeTranslationEngineBuilder.setSourceElementDataKey, default
    "ip"). Engine 2 reads the weighted codes by IPIntelligenceData type, so it
    is key-agnostic.
  • The translation YAML files are not shipped as new data. They are wired in at
    build time from the IP Intelligence data submodule
    (ip-intelligence-cxx/ip-intelligence-data/Translations) into the jar,
    mirroring the .NET project which links the same files.
  • The Resources class loads the files from the classpath by their known
    names, so loading works the same from a directory or a packaged jar.
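
As a rough illustration of the classpath-loading decision, the sketch below reads a named resource through the class loader, which behaves the same for an exploded directory and a jar. The class name ClasspathText is hypothetical, and the exact resource paths used by Resources are not shown in this PR description, so the caller supplies the name.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: reads a text resource from the classpath as UTF-8. */
final class ClasspathText {
    static String read(String resourceName) throws IOException {
        ClassLoader loader = ClasspathText.class.getClassLoader();
        // getResourceAsStream returns null when the resource is absent.
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException(
                    "Resource not found on classpath: " + resourceName);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

Looking a file up by its known name avoids listing a resource directory, which is the part that tends to differ between a file system and a jar.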

Changes

  • New ip-intelligence.translation module:
    • Constants, Resources.
    • data: ICountryCodeTranslationData / CountryCodeTranslationData,
      ICountriesTranslationData / CountriesTranslationData.
    • flowelements: CountryCodeTranslationEngine (+ builder, configurable
      source key), CountriesTranslationEngine (+ builder) with the ordering
      algorithm and the locale-aware Collator sort, recording the sorting
      culture in SortingCultureUsed.
  • pom.xml: register the module in <modules>; copy the translation YAML from
    the submodule at build time.

Testing

Run mvn -pl ip-intelligence.translation test (JDK 17, source/target 1.8): 22
tests, all passing, with PMD and -Werror clean.

  • TranslationTests (21): the 12 tests ported from the .NET suite plus 9 gap
    tests (evidence-key precedence including query.translation overriding the
    header, weights preserved through translation, equal-weight tie break by
    name, unknown locale falling back to English, a missing single name staying
    English, full index alignment across the alphabetical tail,
    SortingCultureUsed, and population names differing from geographical).
    These use a real mini pipeline (a stub IP engine, then the two engines).

  • CountryTranslationIntegrationTest (1): drives the real on-premise engine
    with the 6 GB enterprise data file, then the two engines. With 8.8.8.8 and
    fr_FR the dropdown leads with the most probable country (Pakistan in this
    data) and is followed by all 250 countries alphabetically in French, with the
    code and name lists index-aligned. Skipped automatically when no data file is
    present. Run it with
    -DTestDataFile=<path to a .ipi> (or set 51DEGREES_IPI_PATH).

    JDK note: build this module with JDK 17. The repo's PMD-on-compile check uses
    an ASM version that cannot parse JDK 25 class files.
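
The data-file lookup for the integration test can be pictured as follows. This is a sketch under assumptions: the class name TestDataFileLocator is hypothetical, and the precedence of the system property over the environment variable is a guess rather than something stated above.

```java
import java.io.File;

/** Hypothetical helper: finds the .ipi file for the integration test. */
final class TestDataFileLocator {
    /** Returns the data file, or null when none is configured or present. */
    static File locate(String systemProperty, String envValue) {
        // Assumption: -DTestDataFile takes precedence over the env variable.
        String path = (systemProperty != null && !systemProperty.isEmpty())
            ? systemProperty
            : envValue;
        if (path == null || path.isEmpty()) {
            return null;
        }
        File file = new File(path);
        return file.isFile() ? file : null;
    }

    static File locate() {
        return locate(
            System.getProperty("TestDataFile"),
            System.getenv("51DEGREES_IPI_PATH"));
    }
}
```

A test would treat a null result as the signal to skip instead of failing, which matches the automatic skip described above.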

Notes

  • Additive only. One new module plus one line in the parent pom <modules>.
    The on-premise engine and the cloud path are unchanged.
  • Cross-repo release dependency. This module depends on pipeline.translation,
    a new module in pipeline-java. ip-intelligence-java consumes pipeline at
    the published ${pipeline.version} (currently 4.5.10), while the local
    pipeline-java source is 4.5.7-SNAPSHOT. So:
    • The committed pom references pipeline.translation at ${pipeline.version},
      consistent with the other pipeline dependencies.
    • A clean build (and CI) is green only once pipeline-java publishes
      pipeline.translation in the pipeline line this repo targets. Merge or
      release the pipeline-java change first.
    • For local verification the module was built from the pipeline-java
      worktree and installed into the local Maven repo, then also installed under
      version 4.5.10 (a binary-compatible bridge, verified with javap) so this
      repo resolves it at ${pipeline.version}.
  • Follow-up: the on-premise getting-started web dropdown and the console
    example with an ordering test live in ip-intelligence-java-examples and
    consume this module.

Add a new ip-intelligence.translation module with the two country
translation engines, ported from the .NET FiftyOne.IpIntelligence.Translation
package.

Engine 1 (CountryCodeTranslationEngine) reads the weighted ISO country
codes from the IP Intelligence engine and translates them to English
country names using countrycodes.en_GB.yml. Its source element data key
is configurable and defaults to "ip".

Engine 2 (CountriesTranslationEngine) translates those English names to
the browser language using the countries.<locale>.yml files, then builds
complete ordered lists: the weighted countries first (most probable by
weight, ties broken by translated name), followed by every remaining
country alphabetically. The code and name "All" lists are index aligned so
a demo can render a country dropdown with the most probable country first.
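
The ordering rule above can be sketched in a few lines. This is an illustration of the described behaviour, not the engine's implementation: the class CountryOrdering, its Result type, and the choice to ignore weighted codes that have no translated name are assumptions made for the example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of the "All" list ordering. */
final class CountryOrdering {
    /** Index-aligned lists: codes.get(i) pairs with names.get(i). */
    static final class Result {
        final List<String> codes = new ArrayList<>();
        final List<String> names = new ArrayList<>();
    }

    /**
     * @param weights     weighted ISO codes from IP Intelligence
     * @param namesByCode translated name for every known ISO code
     * @param locale      locale whose collation rules sort the names
     */
    static Result order(Map<String, Double> weights,
                        Map<String, String> namesByCode,
                        Locale locale) {
        Collator collator = Collator.getInstance(locale);
        Comparator<String> byName = (a, b) ->
            collator.compare(namesByCode.get(a), namesByCode.get(b));

        List<String> weighted = new ArrayList<>();
        List<String> rest = new ArrayList<>();
        // Assumption: weighted codes without a translated name are ignored.
        for (String code : namesByCode.keySet()) {
            (weights.containsKey(code) ? weighted : rest).add(code);
        }

        // Highest weight first; equal weights fall back to translated name.
        weighted.sort(Comparator
            .<String>comparingDouble(code -> weights.get(code))
            .reversed()
            .thenComparing(byName));
        // Every remaining country alphabetically by translated name.
        rest.sort(byName);

        Result result = new Result();
        for (String code : weighted) {
            result.codes.add(code);
            result.names.add(namesByCode.get(code));
        }
        for (String code : rest) {
            result.codes.add(code);
            result.names.add(namesByCode.get(code));
        }
        return result;
    }
}
```

Building both lists in the same loop is what keeps them index-aligned, and using a Collator instead of String.compareTo lets accented names sort where a reader of that language expects them.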

The translation YAML files are wired in at build time from the IP
Intelligence data submodule rather than shipped as new data.

Jamesr51d requested a review from justadreamer on June 25, 2026 16:03